41 research outputs found

    Assembling Disease Networks From Causal Interaction Resources

    The development of high-throughput, high-content technologies and the increased ease of their application in clinical settings have raised the expectation that these technologies will have an important impact on diagnosis and personalized therapy. Patient genomic and expression profiles yield lists of genes that are mutated or whose expression is modulated in specific disease conditions. The challenge remains of extracting from these lists functional information that may help to shed light on the mechanisms that are perturbed in the disease, thus setting a rational framework that may help clinical decisions. Network approaches are playing an increasing role in the organization and interpretation of patients' data. Biological networks are generated by connecting genes or gene products according to experimental evidence that demonstrates their interactions. Until recently, most approaches have relied on networks based on physical interactions between proteins. Such networks miss an important piece of information as they lack details on the functional consequences of the interactions. Over the past few years, a number of resources have started collecting causal information of the type 'protein A activates/inactivates protein B' in a structured format. This information may be represented as signed directed graphs in which physiological and pathological signaling can be conveniently inspected. In this review we will (i) present and compare these resources and discuss their different scope in comparison with pathway resources; (ii) compare resources that explicitly capture causality in terms of data content and proteome coverage; and (iii) review how causal graphs can be used to extract disease-specific Boolean networks.

    A Resource to Infer Molecular Paths Linking Cancer Mutations to Perturbation of Cell Metabolism

    Some inherited or somatically acquired gene variants are observed significantly more frequently in the genome of cancer cells. Although many of these cannot be confidently classified as driver mutations, they may contribute to shaping a cell environment that favours cancer onset and development. Understanding how these gene variants causally affect cancer phenotypes may help in developing strategies for reverting the disease phenotype. Here we focus on variants of genes whose products have the potential to modulate metabolism to support uncontrolled cell growth. Over recent months our team of expert curators has undertaken an effort to annotate in the database SIGNOR (1) metabolic pathways that are deregulated in cancer and (2) interactions connecting oncogenes and tumour suppressors to metabolic enzymes. In addition, we refined a recently developed graph analysis tool that permits users to infer causal paths leading from any human gene to modulation of metabolic pathways. The tool is built on a human signed and directed network that connects approximately 8400 biological entities, such as proteins and protein complexes, via causal relationships. The network, which is based on more than 30,000 published causal links, can be downloaded from the SIGNOR website. In addition, as SIGNOR stores information on drugs or other chemicals targeting the activity of many of the genes in the network, the identification of likely functional paths offers a rational framework for exploring new therapeutic strategies that revert the disease phenotype.
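
    The idea of inferring causal paths over a signed, directed network can be illustrated with a minimal sketch. It assumes a toy network with made-up node names and a hypothetical helper, `causal_paths`; it is not the SIGNOR tool or its data format. Each edge carries a sign (+1 for activation, -1 for inhibition), and the net effect of a path is taken to be the product of its edge signs.

```python
from collections import deque

# Toy signed, directed network. All node names and edges are hypothetical
# and serve only to illustrate the idea; they are not SIGNOR data.
# Each edge is (source, target, sign): +1 = activates, -1 = inhibits.
EDGES = [
    ("GENE_A", "KINASE_B", +1),
    ("KINASE_B", "ENZYME_C", -1),
    ("GENE_A", "TF_D", -1),
    ("TF_D", "ENZYME_C", -1),
    ("ENZYME_C", "PATHWAY_X", +1),
]


def build_graph(edges):
    """Turn an edge list into an adjacency map: node -> [(target, sign), ...]."""
    graph = {}
    for source, target, sign in edges:
        graph.setdefault(source, []).append((target, sign))
    return graph


def causal_paths(graph, start, goal, max_len=4):
    """Enumerate simple paths from start to goal with at most max_len edges.

    Returns a list of (path, net_sign) tuples, where net_sign is the
    product of the edge signs along the path.
    """
    results = []
    queue = deque([([start], 1)])
    while queue:
        path, sign = queue.popleft()
        node = path[-1]
        if node == goal:
            results.append((path, sign))
            continue
        if len(path) - 1 >= max_len:
            continue  # path already uses max_len edges; do not extend it
        for target, edge_sign in graph.get(node, []):
            if target not in path:  # keep paths simple (no revisited nodes)
                queue.append((path + [target], sign * edge_sign))
    return results


if __name__ == "__main__":
    for path, sign in causal_paths(build_graph(EDGES), "GENE_A", "PATHWAY_X"):
        effect = "up-regulates" if sign > 0 else "down-regulates"
        print(" -> ".join(path), "|", effect)
```

    In this toy network the two routes from `GENE_A` to `PATHWAY_X` carry opposite net signs, which is the kind of ambiguity that a signed representation exposes and an unsigned physical-interaction network would hide.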

    SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update

    The SIGnaling Network Open Resource (SIGNOR 3.0) is a public repository that captures causal information and represents it according to an 'activity-flow' model. SIGNOR provides freely accessible static maps of causal interactions that can be tailored, pruned and refined to build dynamic and predictive models. Each signaling relationship is annotated with an effect (up/down-regulation) and with the mechanism (e.g. binding, phosphorylation, transcriptional activation) causing the regulation of the target entity. Since its latest release, SIGNOR has undergone a significant upgrade including: (i) a new website that offers an improved user experience and novel advanced search and graph tools; (ii) a significant content growth adding up to a total of approximately 33,000 manually annotated causal relationships between more than 8900 biological entities; (iii) an increase in the number of manually annotated pathways, currently including pathways deregulated by SARS-CoV-2 infection or involved in neurodevelopment, synaptic transmission and metabolism, among others; (iv) additional features such as a new model to represent metabolic reactions and a new confidence score assigned to each interaction.

    Complex Portal 2022: New curation frontiers

    The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database of macromolecular complexes with known function from a range of model organisms. It summarizes complex composition, topology and function along with links to a large range of domain-specific resources (i.e. wwPDB, EMDB and Reactome). Since the last update in 2019, we have produced a first draft complexome for Escherichia coli, maintained and updated that of Saccharomyces cerevisiae, added over 40 coronavirus complexes and increased the human complexome to over 1100 complexes that include approximately 200 complexes that act as targets for viral proteins or are part of the immune system. The display of protein features in ComplexViewer has been improved and the participant table is now colour-coordinated with the nodes in ComplexViewer. Community collaboration has expanded, for example by contributing to an analysis of putative transcription cofactors and providing data accessible to semantic web tools through Wikidata, which is now populated with manually curated Complex Portal content through a new bot. Our data license is now CC0 to encourage data reuse. Users are encouraged to get in touch, provide us with feedback and send curation requests through the ‘Support’ link.

    MINT, the molecular interaction database: 2009 update

    MINT (http://mint.bio.uniroma2.it/mint) is a public repository for molecular interactions reported in peer-reviewed journals. Since its last report, MINT has grown considerably in size and evolved in scope to meet the requirements of its users. The main changes include a more precise definition of the curation policy and the development of an enhanced and user-friendly interface to facilitate the analysis of the ever-growing interaction dataset. MINT has adopted the PSI-MI standards for the annotation and representation of molecular interactions and is a member of the IMEx consortium.

    The IntAct database: Efficient access to fine-grained molecular interaction data

    The IntAct molecular interaction database (https://www.ebi.ac.uk/intact) is a curated resource of molecular interactions, derived from the scientific literature and from direct data depositions. As of August 2021, IntAct provides more than one million binary interactions, curated by twelve global partners of the International Molecular Exchange consortium, for which the IntAct database provides a shared curation and dissemination platform. The IMEx curation policy has always emphasised a fine-grained data and curation model, aiming to capture the relevant experimental detail essential for the interpretation of the provided molecular interaction data. Here, we present recent curation focus and progress, as well as a completely redeveloped website that presents IntAct data in a much more user-friendly and detailed way.

    The MIntAct project—IntAct as a common curation platform for 11 molecular interaction databases

    IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org).

    BioCreative III interactive task: an overview

    The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end-user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus, in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested.

    The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

    BACKGROUND: Determining the usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. For this purpose the BCIII-ACT corpus was provided, which includes training, development and test sets of over 12,000 PPI-relevant and non-relevant PubMed abstracts labeled manually by domain experts, together with the recorded human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. RESULTS: A total of 11 teams participated in at least one of the two PPI tasks (10 in the ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets and evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews Correlation Coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods; some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, and some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, the best MCC score 0.55. For competitive systems with an acceptable recall (above 35%), the macro-averaged precision ranged between 50% and 80%, with a maximum F-Score of 55%. CONCLUSIONS: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.
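
    To make the gap between accuracy and MCC on unbalanced collections concrete, the following minimal sketch computes both from a binary confusion matrix using the standard MCC formula. The counts are hypothetical and chosen only for illustration; they are not taken from the BioCreative III evaluation, and the function names are assumptions of this sketch.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all items that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical unbalanced collection: 100 relevant, 900 non-relevant.
    counts = dict(tp=50, fp=50, tn=850, fn=50)
    print("accuracy:", accuracy(**counts))
    print("MCC:", round(matthews_corrcoef(**counts), 3))
```

    With these made-up counts the classifier is right on most items simply because the negative class dominates, while the MCC stays well below the accuracy, which is why MCC is the more informative figure for collections with a strong class imbalance.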